AWS Textract OCR

AutomatR.Windows.Activities.AWSTextractOCR

The "AWS Textract OCR" activity in AutomatR integrates with Amazon Textract, a fully managed OCR service from Amazon Web Services (AWS). The activity extracts text and data from an image or document that is stored either in an Amazon S3 bucket or on the local machine.

Properties

Name	Description
*Input*
Access Key	Specifies the AWS Access Key ID associated with your AWS account. This key is used to authenticate with the Textract OCR service. Accepts `String` variables containing the AWS Access Key ID.
Bucket Name	Specifies the name of the Amazon S3 bucket where the input document or image is stored. Accepts `String` variables containing the bucket name.
File Name	Specifies the name of the document or image file in the specified Amazon S3 bucket. Textract performs OCR on this file. Accepts `String` variables containing the file name.
File Path	Specifies the local path to the document or image file. Use this property when the file is not stored in an Amazon S3 bucket. Accepts `String` variables containing the local file path.
Region	Specifies the AWS region where the Amazon Textract service is hosted, for example "us-east-1". Accepts `String` variables containing the AWS region.
Region Selection	Lets you select the area of the image to capture: click the ellipsis button (...) and drag the mouse to mark the region of interest. Use this to limit OCR to a specific part of an image. Variables are not supported because the selection is made interactively.
Secret Key	Specifies the AWS Secret Access Key associated with your AWS account. This key is used to authenticate with the Textract OCR service. Accepts `String` variables containing the AWS Secret Access Key.
*Misc*
Display Name	The display name of the activity. A display name is automatically generated when you indicate a target.
*Optional*
Delay	Specifies the amount of time (in seconds) to wait before executing the Textract OCR activity. This can help with synchronization issues. Accepts `Integer` variables containing the delay duration. Example: to wait 1 second (1000 milliseconds), enter 1.
*Output*
Result	Outputs the result of the AWS Textract OCR operation, typically the extracted text along with additional information about the document. Accepts variables of a suitable type (e.g., `String` variables) to store the OCR result.
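
The documentation does not describe how the activity calls Textract internally. As a rough sketch, assuming it uses Textract's synchronous DetectDocumentText API, the input properties could map onto that request's `Document` argument as shown here. The function name `build_document_argument` and its parameter names are hypothetical and only mirror the property names in the table.

```python
from pathlib import Path
from typing import Optional


def build_document_argument(bucket_name: Optional[str] = None,
                            file_name: Optional[str] = None,
                            file_path: Optional[str] = None) -> dict:
    """Build the `Document` argument for a DetectDocumentText request.

    Assumption: Bucket Name + File Name take precedence over File Path.
    The activity's real precedence rule is not documented.
    """
    if bucket_name and file_name:
        # Document stored in S3: Textract reads it directly from the bucket.
        return {"S3Object": {"Bucket": bucket_name, "Name": file_name}}
    if file_path:
        # Local document: the raw file bytes travel inside the request.
        return {"Bytes": Path(file_path).read_bytes()}
    raise ValueError("Provide Bucket Name and File Name, or a File Path.")
```

Keeping this mapping in one small function makes it clear that either the S3 pair or the local path is needed, never neither.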

How to use:

Drag and drop the "AWS Textract OCR" activity onto the workflow.
Configure the properties by providing the necessary AWS credentials, file information, and region details.
Use the region selection feature to define the area of interest within the image.
Optionally, configure the delay and customize the display name.
Execute the workflow to perform OCR using the AWS Textract service.
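
How the activity applies the selected region is not documented; it may crop the image before sending it to Textract. One possible equivalent, sketched here under that caveat, is to filter Textract's response by bounding box. Textract reports each block's `BoundingBox` as ratios (0 to 1) of the page width and height. The function `lines_in_region` and its parameters are hypothetical.

```python
from typing import List


def lines_in_region(blocks: List[dict], left: float, top: float,
                    width: float, height: float) -> List[str]:
    """Return the text of LINE blocks lying fully inside a region.

    The region uses the same ratio coordinates as Textract's BoundingBox
    (Left, Top, Width, Height, each between 0 and 1).
    """
    right, bottom = left + width, top + height
    selected = []
    for block in blocks:
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        box = block["Geometry"]["BoundingBox"]
        inside = (
            box["Left"] >= left
            and box["Top"] >= top
            and box["Left"] + box["Width"] <= right
            and box["Top"] + box["Height"] <= bottom
        )
        if inside:
            selected.append(block["Text"])
    return selected
```

For instance, passing `left=0.0, top=0.0, width=0.5, height=0.5` keeps only lines located entirely in the top-left quarter of the page.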

Note: Ensure that the specified AWS credentials (Access Key, Secret Key) have the necessary permissions to interact with the Amazon Textract service.
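
As an illustration of what "necessary permissions" could look like, the sketch below builds a minimal IAM policy document. It assumes the activity calls the DetectDocumentText action and reads the input file from S3; the documentation does not confirm which Textract actions are actually used, so adjust the action list accordingly. The helper name `build_textract_policy` is hypothetical.

```python
import json


def build_textract_policy(bucket_name: str) -> dict:
    """Return a minimal IAM policy for OCR on files in one S3 bucket.

    Assumption: only textract:DetectDocumentText and s3:GetObject are needed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Permission to call the Textract text-detection API.
                "Effect": "Allow",
                "Action": ["textract:DetectDocumentText"],
                "Resource": "*",
            },
            {
                # Permission to read the input files from the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to paste into the IAM console.
    print(json.dumps(build_textract_policy("your_s3_bucket"), indent=2))
```

The S3 statement is only relevant when Bucket Name and File Name are used; a local File Path needs the Textract permission alone.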

Example: The following configuration uses the "AWS Textract OCR" activity to extract text from an image stored in an Amazon S3 bucket:

AWS Textract OCR:
  Display Name: "Extract Text from Image"
  Access Key: "your_access_key"
  Secret Key: "your_secret_key"
  Bucket Name: "your_s3_bucket"
  File Name: "sample.png"
  Region: "us-east-1"
  Region Selection: [User Interaction]
  Result: extractedText

In this example, the activity uses the AWS Textract service to extract text from the "sample.png" image file stored in the specified Amazon S3 bucket. The region of interest is interactively defined by the user through the region selection feature. The extracted text is stored in the variable "extractedText" for further use in the workflow.
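
For comparison, a direct call to Textract with the same values might look like the sketch below, which uses the boto3 SDK (a third-party package, imported lazily so the parsing helper works without it). This is not the activity's implementation: the use of DetectDocumentText and the function names `extract_lines` and `run_textract_ocr` are assumptions made for illustration.

```python
def extract_lines(response: dict) -> str:
    """Join the text of all LINE blocks in a Textract response."""
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def run_textract_ocr(access_key: str, secret_key: str, region: str,
                     bucket_name: str, file_name: str) -> str:
    """Call Textract on an S3 object and return the detected text."""
    import boto3  # requires `pip install boto3`

    client = boto3.client(
        "textract",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    response = client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket_name, "Name": file_name}}
    )
    return extract_lines(response)


# Usage with the example values (needs valid credentials and network access):
# extracted_text = run_textract_ocr("your_access_key", "your_secret_key",
#                                   "us-east-1", "your_s3_bucket", "sample.png")
```

Here `extract_lines` plays the role of the Result output: it reduces the list of blocks Textract returns to a plain string that later workflow steps can consume.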
